Abstract

We study the density of the weights of Generalized Reed-Muller codes. Let $\mathrm{RM}_p(r,m)$
denote the code of multivariate polynomials over $\mathbb{F}_p$ in $m$ variables of total degree at
most $r$. We consider the case of fixed degree $r$, when we let the number of variables $m$
tend to infinity. We prove that the set of relative weights of codewords is quite sparse:
for every $\alpha \in [0,1]$ which is not rational of the form $\ell/p^k$, there exists an interval around
$\alpha$ in which no relative weight exists, for any value of $m$. This line of research is, to the
best of our knowledge, new, and complements the traditional lines of research, which
focus on the weight distribution and the divisibility properties of the weights.

Equivalently, we study distributions taking values in a finite field, which can be
approximated by distributions coming from constant degree polynomials, where we do
not bound the number of variables. We give a complete characterization of all such
distributions.

1 Introduction

We study the weights of Generalized Reed-Muller codes. 

Definition 1 (Generalized Reed-Muller codes). Let $\mathbb{F}_p$ be a prime finite field. We denote
by $\mathrm{RM}_p(r,m)$ the $r$-th order Generalized Reed-Muller code with $m$ variables. This is a linear
code over $\mathbb{F}_p$, whose codewords $f \in \mathrm{RM}_p(r,m)$, $f : \mathbb{F}_p^m \to \mathbb{F}_p$, are evaluations of polynomials
over $\mathbb{F}_p$ in $m$ variables of total degree at most $r$.

Definition 2 (Weights). Let $C$ be a code. The weight of a codeword $f \in C$ is the number
of non-zero elements in it. For $C = \mathrm{RM}_p(r,m)$, this is

$$\mathrm{wt}(f) = |\{x \in \mathbb{F}_p^m : f(x) \neq 0\}|$$

One of the main problems in coding theory is understanding the possible weights and
the distribution of the weights for various families of codes. Generalized Reed-Muller codes
are among the most basic families of codes, and have been researched extensively. To quote [15]:

Reed-Muller (or RM) codes are one of the oldest and best understood families of codes 

Understanding the weights of codewords of Generalized Reed-Muller codes is considered
to be one of the important questions in coding theory; however, our current understanding
of it is quite limited. There are two traditional lines of research regarding the weights of
Generalized Reed-Muller codes: their distribution, and their divisibility properties. We
introduce in this work a third line of research, studying the density of the weights.

We study the weights of codewords of $\mathrm{RM}_p(r,m)$ when we fix the order $r$ and let the
number of variables $m$ tend to infinity. This can be better described in terms of the relative
weights of the codewords.

Definition 3 (Relative weights). Let $C$ be a code. The relative weight of a codeword $f \in C$
is the fraction of non-zero elements in it. For $C = \mathrm{RM}_p(r,m)$, this is

$$\text{rel-wt}(f) = \frac{|\{x \in \mathbb{F}_p^m : f(x) \neq 0\}|}{p^m} = \Pr_{x \in \mathbb{F}_p^m}[f(x) \neq 0]$$

Let $W_p(r,m)$ be the set of relative weights of codewords of $\mathrm{RM}_p(r,m)$:

$$W_p(r,m) = \{\text{rel-wt}(f) : f \in \mathrm{RM}_p(r,m)\}$$

Since $\mathrm{RM}_p(r,m)$ can be embedded in $\mathrm{RM}_p(r,m+1)$, we have $W_p(r,m) \subseteq W_p(r,m+1)$.
Thus it makes sense to consider the limit of the weights when $m \to \infty$. We define $W_p(r)$ to
be the set of weights of codewords of $\mathrm{RM}_p(r,m)$ where we do not restrict $m$, i.e.

$$W_p(r) = \bigcup_{m \in \mathbb{N}} W_p(r,m)$$
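
For very small parameters the sets $W_p(r,m)$ can be enumerated by brute force, which makes the nesting $W_p(r,m) \subseteq W_p(r,m+1)$ concrete. The following is a minimal sketch, not part of the paper; the function names `eval_monomial` and `relative_weights` are hypothetical, and the enumeration is exponential in the number of monomials, so it is only usable for tiny $p$, $r$, $m$. It relies on the standard fact that, as functions on $\mathbb{F}_p^m$, polynomials of total degree at most $r$ are spanned by monomials with every exponent below $p$.

```python
from fractions import Fraction
from itertools import product


def eval_monomial(exponents, x, p):
    """Value of prod_i x_i**e_i in F_p (pow(0, 0, p) is 1)."""
    value = 1
    for e, xi in zip(exponents, x):
        value = value * pow(xi, e, p) % p
    return value


def relative_weights(p, r, m):
    """Brute-force W_p(r, m): relative weights of all polynomials of degree <= r."""
    points = list(product(range(p), repeat=m))
    # Monomials with each exponent < p and total degree <= r.
    monos = [e for e in product(range(p), repeat=m) if sum(e) <= r]
    # table[i][j] = value of monomial i at point j, computed once.
    table = [[eval_monomial(e, x, p) for x in points] for e in monos]
    weights = set()
    for coeffs in product(range(p), repeat=len(monos)):
        nonzero = 0
        for j in range(len(points)):
            value = sum(c * row[j] for c, row in zip(coeffs, table)) % p
            nonzero += value != 0
        weights.add(Fraction(nonzero, len(points)))
    return weights


if __name__ == "__main__":
    w1 = relative_weights(3, 2, 1)
    w2 = relative_weights(3, 2, 2)
    print(sorted(w1))
    print(sorted(w2))
    print(w1 <= w2)
```

For quadratics over $\mathbb{F}_3$, going from one variable to two adds new weights such as $8/9$ (attained by $x_1^2 + x_2^2$, whose only zero is the origin), while every weight from the one-variable case is retained.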

The set $W_p(r)$ is contained in the interval $[0,1]$, and in fact can be further restricted based
on the minimal relative weight of $\mathrm{RM}_p(r,m)$, which is well known (see for example [1]). We
are interested however in the density of the weight set. Our a priori intuition was that
the set $W_p(r)$ should be relatively dense, since we allow the number of variables to grow
indefinitely. However, our main theorem shows that the truth is quite far from it. In order
to state it, we first define the notion of $p$-rational numbers.

Definition 4 ($p$-rational numbers). We say a number $\alpha \in [0,1]$ is $p$-rational if it is rational
of the form $\alpha = \ell/p^k$ for some integers $\ell, k$.

Theorem 1 (Main theorem). Let $\alpha \in [0,1]$ be a number which is not $p$-rational. Then there
exists $\varepsilon > 0$ such that $W_p(r)$ contains no value in the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$. Equivalently,
there is no sequence of multivariate polynomials $f_1, f_2, \dots$ over $\mathbb{F}_p$ of degree at most $r$, each
possibly on a different number of variables, such that $\lim_{i \to \infty} \text{rel-wt}(f_i) = \alpha$.

Thus, around every $\alpha \in [0,1]$ which is not $p$-rational, there is a "hole", in which there
are no relative weights of $\mathrm{RM}_p(r,m)$.

Another way to view Theorem 1 is as a theorem about the approximation of random
variables over finite fields by low-degree polynomials.

Definition 5 (Distribution of a polynomial). The distribution of a polynomial $f(x_1, \dots, x_m)$
over $\mathbb{F}_p$ is defined to be the distribution of $f$ applied to a uniform input in $\mathbb{F}_p^m$.

Let $X$ be a random variable taking values in $\mathbb{F}_p$. We say $X$ can be approximated by
degree-$r$ polynomials if its distribution can be arbitrarily approximated by the distribution
of degree-$r$ polynomials. That is to say, for every $\varepsilon > 0$, there exists a multivariate
polynomial $f(x_1, \dots, x_m)$ over $\mathbb{F}_p$ of total degree at most $r$, whose distribution is $\varepsilon$-close to
the distribution of $X$ (for example in statistical distance). The following is an immediate
corollary of Theorem 1.

Corollary 2. Let $X$ be a random variable taking values in $\mathbb{F}_p$, which can be approximated
by degree-$r$ polynomials, for some constant $r$. Then all the probabilities $\Pr[X = a]$ are
$p$-rational. In particular, $X$ can be realized as the distribution of a single polynomial over
$\mathbb{F}_p$.

So for example, we cannot have an arbitrarily good approximation of perfect random bits
by constant degree polynomials over $\mathbb{F}_3$, for any constant degree, since $1/2$ is not 3-rational.
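
Whether a given rational number is $p$-rational is a purely arithmetic question: a fraction in lowest terms has the form $\ell/p^k$ exactly when its denominator is a power of $p$. The sketch below illustrates this for rational inputs only (an irrational $\alpha$ is never $p$-rational); the helper name `is_p_rational` is hypothetical and not taken from the paper.

```python
from fractions import Fraction


def is_p_rational(alpha, p):
    """True if alpha in [0, 1] equals l / p**k for some integers l, k >= 0."""
    alpha = Fraction(alpha)
    if not 0 <= alpha <= 1:
        return False
    d = alpha.denominator  # Fraction always stores the value in lowest terms
    while d % p == 0:
        d //= p
    return d == 1


if __name__ == "__main__":
    print(is_p_rational(Fraction(1, 2), 3))  # 1/2 has denominator 2, not a power of 3
    print(is_p_rational(Fraction(5, 9), 3))
    print(is_p_rational(Fraction(1, 2), 2))
```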

Returning to the framework of weights of Generalized Reed-Muller codes, we note that
although the set $W_p(r)$ is sparse, it is not finite. For example, consider the set $W_2(2)$,
the set of relative weights of quadratics over $\mathbb{F}_2$. The relative weight of $f(x_1, \dots, x_{2k}) =
x_1 x_2 + x_3 x_4 + \cdots + x_{2k-1} x_{2k}$ is $\frac{1}{2} - \frac{1}{2^{k+1}}$, and the set of these weights is infinite.
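
The weights of this family of quadratics can be checked by direct enumeration for small $k$. The sketch below is an illustration only; the function name `rel_wt_inner_product` is hypothetical, and the closed form $\frac{1}{2} - \frac{1}{2^{k+1}}$ it is compared against is the standard value for this quadratic form over $\mathbb{F}_2$.

```python
from fractions import Fraction
from itertools import product


def rel_wt_inner_product(k):
    """Relative weight over F_2 of x1*x2 + x3*x4 + ... + x_{2k-1}*x_{2k}."""
    nonzero = 0
    for x in product((0, 1), repeat=2 * k):
        value = sum(x[2 * i] * x[2 * i + 1] for i in range(k)) % 2
        nonzero += value  # value is 1 exactly when f(x) != 0
    return Fraction(nonzero, 2 ** (2 * k))


if __name__ == "__main__":
    for k in range(1, 6):
        closed_form = Fraction(1, 2) - Fraction(1, 2 ** (k + 1))
        print(k, rel_wt_inner_product(k), closed_form)
```

The values are pairwise distinct and increase towards $1/2$, which is itself 2-rational, so this accumulation point is consistent with Theorem 1.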

1.1 Related work 

As we mentioned before, the two traditional lines of research regarding the weights of
Generalized Reed-Muller codes are studying their weight distribution and their divisibility
properties. We now describe them in more detail.

The weight distribution of $\mathrm{RM}_p(r,m)$ is the number of codewords below a certain weight.
The case of $r = 1$, i.e. of linear functions, is trivial, since all non-constant codewords have
the same weight. The case of $r = 2$, i.e. of quadratic functions, is also fully understood.
A theorem of Dickson [15] gives a canonical characterization of quadratic functions, and in
particular gives the possible weights and the weight distribution of quadratic functions.
By the MacWilliams identity, this characterizes the weight distribution of their dual codes,
which are $\mathrm{RM}_p(m-2,m)$ and $\mathrm{RM}_p(m-3,m)$. These are, to the best of our knowledge,
the only (non-trivial) orders for which a complete characterization of the weights of Generalized
Reed-Muller codes is known. For other orders, a complete characterization is known only for
specific values of $m$. For example, for cubics the record is the work of Sugita, Kasami and
Fujiwara [17], characterizing the weight distribution for $\mathrm{RM}_2(3,9)$.

Considering general orders, several characteristics of the weights are known. The minimal
weight of non-zero codewords in $\mathrm{RM}_p(r,m)$ is known, as are the codewords achieving
this minimal distance [3]. In the case of Reed-Muller codes, corresponding to $p = 2$, Kasami
and Tokura [11] give a complete characterization of codewords of weight at most twice the
minimal weight of the code, and Azumi, Kasami and Tokura [1] gave a characterization of
codewords of weight at most 2.5 times the minimal weight of the code. Recently, Kaufman and
the author [10] gave a relatively tight estimate on the number of codewords in Reed-Muller
codes, holding for all weights.

The second line of research is divisibility of the weights of codewords. Ax [2] proved that
all weights of codewords $f \in \mathrm{RM}_p(r,m)$ are divisible by $p^{\lceil m/r \rceil - 1}$. This was later generalized
to general codes [13, 15]. For a survey on divisible codes see [18] or [12].
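
Ax's divisibility property, read here as divisibility of every weight by $p^{\lceil m/r \rceil - 1}$, can be observed directly on small binary codes. The following sketch is an illustration restricted to $p = 2$ and is not taken from the paper; the function name `rm2_weights` is hypothetical. It packs the evaluation vector of each multilinear monomial into an integer and forms all codewords by XOR.

```python
from itertools import combinations


def rm2_weights(r, m):
    """Set of weights of the binary Reed-Muller code RM_2(r, m), by enumeration."""
    n = 2 ** m
    # Evaluation vector of each multilinear monomial of degree <= r,
    # packed into an n-bit integer (bit `point` is the value at that input).
    monos = []
    for d in range(r + 1):
        for subset in combinations(range(m), d):
            vec = 0
            for point in range(n):
                if all((point >> i) & 1 for i in subset):
                    vec |= 1 << point
            monos.append(vec)
    # codeword[mask] is the XOR of the monomials selected by the bits of mask.
    codeword = [0] * (1 << len(monos))
    for mask in range(1, len(codeword)):
        low = (mask & -mask).bit_length() - 1
        codeword[mask] = codeword[mask & (mask - 1)] ^ monos[low]
    return {bin(v).count("1") for v in codeword}


if __name__ == "__main__":
    r, m = 2, 5
    divisor = 2 ** (-(-m // r) - 1)  # 2 ** (ceil(m / r) - 1)
    weights = rm2_weights(r, m)
    print(sorted(weights), divisor)
    print(all(w % divisor == 0 for w in weights))
```

For $r = 2$, $m = 5$ the divisor is $4$, and the enumeration runs over $2^{16}$ codewords of length $32$.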

1.2 Organization 

The paper is organized as follows. Theorem 1 is proved in Section 2. The proof is based on
a technical lemma which is proved in Section 3.

2 Proof of Theorem 1

We study codewords $f \in \mathrm{RM}_p(r,m)$. Equivalently, we study polynomials: $f$ is a polynomial
over $\mathbb{F}_p$ in $m$ variables of total degree at most $r$. First, we fix some notation. We denote
probabilities according to a distribution $D$ by $\Pr_{z \sim D}$. For a set $S$ we denote by $U_S$ the uniform
distribution over $S$, and we shorthand $\Pr_{z \in S}$ for $\Pr_{z \sim U_S}$. We let $\mathbb{N} = \{1, 2, \dots\}$ denote the
set of natural numbers. We will denote elements of $\mathbb{F}_p^m$ by $x = (x_1, \dots, x_m)$, and polynomials
or functions by $f(x) = f(x_1, \dots, x_m)$. When we refer to the degree of a polynomial, we will
always mean its total degree. The relative weight of a polynomial/function $f : \mathbb{F}_p^m \to \mathbb{F}_p$ is
the fraction of non-zero elements in it,

$$\text{rel-wt}(f) = \Pr_{x \in \mathbb{F}_p^m}[f(x) \neq 0]$$

In order to prove Theorem 1 we will show that for any degree-$r$ polynomial $f(x_1, \dots, x_m)$,
there exists a function $g(x_1, \dots, x_c)$ on a constant number of inputs (i.e. independent of $m$),
such that $\text{rel-wt}(f) \approx \text{rel-wt}(g)$. This is straightforward if the required approximation is
fixed a priori; we show this can be achieved even if the error is allowed to depend arbitrarily
on the number of inputs $c$.

Lemma 3. Let $\varepsilon : \mathbb{N} \to (0,1)$ be an arbitrary mapping from the naturals to $(0,1)$. For any
constant degree $r$ there exists a constant $C = C(\mathbb{F}_p, r, \varepsilon(\cdot))$ such that the following holds: for
any degree-$r$ polynomial $f(x) = f(x_1, \dots, x_m)$, there exists $c \le C$ and a function $g(x_1, \dots, x_c)$,
such that

$$|\text{rel-wt}(f) - \text{rel-wt}(g)| \le \varepsilon(c)$$

Remark. In fact, a somewhat stronger version of the lemma also holds. Not only is
$|\text{rel-wt}(f) - \text{rel-wt}(g)| \le \varepsilon(c)$, but the statistical distance between the distributions of $f$ and $g$ is
bounded by $\varepsilon(c)$. However, we will not need this stronger version in the proof of Theorem 1.

We now prove Theorem 1 using Lemma 3.

Proof of Theorem 1. Let $\alpha \in (0,1)$ be a number which is not $p$-rational, and assume for
contradiction that there exists a sequence of polynomials $f_1, f_2, \dots$ of degree at most $r$, where
$f_k = f_k(x_1, \dots, x_{m_k})$, whose relative weights converge to $\alpha$,

$$\lim_{k \to \infty} \text{rel-wt}(f_k) = \alpha.$$

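We now define a mapping $\delta$ from the naturals to $(0,1)$. For every $c \in \mathbb{N}$, define $\delta(c)$ to
be the distance of $\alpha$ from the set of rational numbers of the form $\ell/p^c$. Explicitly, $\delta(c)$ is given
by

$$\delta(c) = \min\left\{ \alpha - \frac{\lfloor \alpha p^c \rfloor}{p^c},\ \frac{\lceil \alpha p^c \rceil}{p^c} - \alpha \right\}$$

Notice that $\delta(\cdot)$ is non-increasing, and by our assumption that $\alpha$ is not $p$-rational, $\delta(c) > 0$
for all $c \in \mathbb{N}$.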
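
To make the mapping concrete, the distance from $\alpha$ to the nearest fraction with denominator $p^c$ can be computed exactly for a rational $\alpha$. The sketch below is illustrative only (the proof itself allows any real $\alpha$ that is not $p$-rational); the function name `delta` mirrors the mapping described above, and the choice $\alpha = 1/2$, $p = 3$ is an assumption made for the example.

```python
from fractions import Fraction
from math import ceil, floor


def delta(alpha, p, c):
    """Distance from alpha to the nearest fraction of the form l / p**c."""
    scaled = alpha * p ** c
    below = Fraction(floor(scaled), p ** c)  # largest l / p**c that is <= alpha
    above = Fraction(ceil(scaled), p ** c)   # smallest l / p**c that is >= alpha
    return min(alpha - below, above - alpha)


if __name__ == "__main__":
    alpha = Fraction(1, 2)
    values = [delta(alpha, 3, c) for c in range(1, 7)]
    print(values)
    # Positive for every c and non-increasing in c.
    print(all(v > 0 for v in values))
    print(all(a >= b for a, b in zip(values, values[1:])))
```

For $\alpha = 1/2$ and $p = 3$ the distance is $\frac{1}{2 \cdot 3^c}$, because $3^c$ is odd and the two nearest numerators are $(3^c \pm 1)/2$.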

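Set $\varepsilon(c) = \frac{\delta(c)}{2}$. Once we fix the mapping $\varepsilon(\cdot)$, we can use Lemma 3: there exists a
constant $C = C(\mathbb{F}_p, r, \varepsilon(\cdot))$, such that for any polynomial $f_k$ there exists $c_k \le C$, and a
function $g_k(x_1, \dots, x_{c_k})$, such that

$$|\text{rel-wt}(f_k) - \text{rel-wt}(g_k)| \le \varepsilon(c_k) = \frac{\delta(c_k)}{2} \qquad (1)$$

Since $\lim_{k \to \infty} \text{rel-wt}(f_k) = \alpha$, and $\varepsilon(\cdot)$ is positive, there exists some $k$ such that

$$|\text{rel-wt}(f_k) - \alpha| < \varepsilon(C) = \frac{\delta(C)}{2} \qquad (2)$$

Combining (1) and (2), and since $\delta(\cdot)$ is non-increasing, we get that

$$|\text{rel-wt}(g_k) - \alpha| < \frac{\delta(c_k)}{2} + \frac{\delta(C)}{2} \le \delta(c_k) \qquad (3)$$

We now show this cannot hold. $g_k$ is a function on $c_k$ inputs;
thus, its relative weight is rational of the form $\ell/p^{c_k}$. By definition of $\delta(\cdot)$:

$$|\text{rel-wt}(g_k) - \alpha| = \left| \frac{\ell}{p^{c_k}} - \alpha \right| \ge \delta(c_k) \qquad (4)$$

Combining (3) and (4) yields a contradiction. Thus, $\alpha$ must be $p$-rational.

□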



3 Proof of Lemma 3 



The proof of Lemma 3 is based on regularity results for constant degree polynomials by
Green and Tao [8] and by Kaufman and Lovett [9]. We first make some definitions. In this
section, all polynomials will be polynomials over $\mathbb{F}_p$ in $m$ variables.

Definition 6 (rank of polynomials). Let $f(x)$ be a degree-$r$ polynomial. The $(r-1)$-rank
of $f$, denoted by $\mathrm{rank}_{r-1}(f)$, is the minimal number of degree-$(r-1)$ polynomials required
to compute $f$. This means, $\mathrm{rank}_{r-1}(f)$ is the minimal $c$ such that there exist polynomials
$g_1(x), \dots, g_c(x)$ of degree at most $r-1$ and a function $F : \mathbb{F}_p^c \to \mathbb{F}_p$ such that

$$f(x) = F(g_1(x), \dots, g_c(x))$$

Definition 7 (regularity of polynomials). A degree-$r$ polynomial $f(x)$ is $T$-regular if
$\mathrm{rank}_{r-1}(f) > T$. A set of polynomials $\{f_1(x), \dots, f_c(x)\}$ is $T$-regular if all non-zero linear
combinations of them are $T$-regular. This means, for every $a_1, \dots, a_c \in \mathbb{F}_p$ not all zero, let
$f'(x) = a_1 f_1(x) + \cdots + a_c f_c(x)$. We require that $f'$ is not identically zero, and that if
$\mathrm{degree}(f') = k$, then $\mathrm{rank}_{k-1}(f') > T$.

We will need the following result from [8]: any degree-$r$ polynomial $f$ is a function of
a constant number of regular polynomials $g_1, \dots, g_c$, even if the regularity requirements on
$g_1, \dots, g_c$ depend on the number of polynomials $c$:

Lemma 4 (Lemma 2.3 in [8]). Let $T : \mathbb{N} \to \mathbb{N}$ be an arbitrary mapping. There exists a
constant $C_1 = C_1(\mathbb{F}_p, r, T(\cdot))$ such that the following holds. For any degree-$r$ polynomial
$f(x)$ there exists some $c \le C_1$, a set of polynomials $g_1(x), \dots, g_c(x)$ of degree at most $r$ and
a function $F : \mathbb{F}_p^c \to \mathbb{F}_p$, such that:

1. $f(x) = F(g_1(x), \dots, g_c(x))$;

2. The set of polynomials $\{g_1(x), \dots, g_c(x)\}$ is $T(c)$-regular.

We also need a result relating regularity of polynomials to their joint distribution.

Definition 8 (distribution of polynomials). Let $f : \mathbb{F}_p^m \to \mathbb{F}_p$ be a polynomial. Its
distribution $\mathcal{D}(f)$ is the distribution (taking values in $\mathbb{F}_p$) of applying $f$ on a random input
$x \in \mathbb{F}_p^m$.

For a set of polynomials $f_1, \dots, f_c : \mathbb{F}_p^m \to \mathbb{F}_p$, their joint distribution $\mathcal{D}(f_1, \dots, f_c)$ (taking
values in $\mathbb{F}_p^c$) is the distribution of applying $f_1, \dots, f_c$ on a common random input $x \in \mathbb{F}_p^m$,

$$\mathcal{D}(f_1, \dots, f_c) = (f_1(x), \dots, f_c(x))_{x \sim U_{\mathbb{F}_p^m}}$$

Definition 9 (statistical distance). Let $D', D''$ be two distributions taking values in the
same set $S$. Their statistical distance is

$$\mathrm{dist}(D', D'') = \frac{1}{2} \sum_{s \in S} \left| \Pr_{z \sim D'}[z = s] - \Pr_{z \sim D''}[z = s] \right|$$
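
As a small illustration of these definitions, the distribution of a polynomial over $\mathbb{F}_p$ and its statistical distance from uniform can be computed by enumeration. This is a sketch under stated assumptions: statistical distance is taken as half the $L_1$ distance, and the helper names `distribution` and `statistical_distance` as well as the example polynomials are hypothetical choices, not taken from the paper.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def distribution(f, p, m):
    """Distribution of f(x) for x uniform in F_p^m, as a dict value -> probability."""
    counts = Counter(f(x) % p for x in product(range(p), repeat=m))
    return {s: Fraction(counts[s], p ** m) for s in range(p)}


def statistical_distance(d1, d2):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(d1) | set(d2)
    return sum(abs(d1.get(s, 0) - d2.get(s, 0)) for s in support) / 2


if __name__ == "__main__":
    p = 3
    uniform = {s: Fraction(1, p) for s in range(p)}
    one_product = distribution(lambda x: x[0] * x[1], p, 2)
    two_products = distribution(lambda x: x[0] * x[1] + x[2] * x[3], p, 4)
    print(statistical_distance(one_product, uniform))
    print(statistical_distance(two_products, uniform))
```

In this example the sum of two products on disjoint variables is closer to uniform than a single product, in the spirit of the connection between rank and distance from uniform.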

The following result from [9] shows that polynomials whose distribution is not close to
uniform must have low rank:

Lemma 5 (Theorem 4 in [9]). Let $f(x)$ be a degree-$r$ polynomial such that $\mathrm{dist}(\mathcal{D}(f), U_{\mathbb{F}_p}) > \varepsilon$.
Then $\mathrm{rank}_{r-1}(f) \le C_2(\mathbb{F}_p, r, \varepsilon)$.

We combine Lemma 4 and Lemma 5 to prove the following lemma, showing that any
degree-$r$ polynomial is a function of a constant number of polynomials which are
uncorrelated.

Lemma 6. Let $\varepsilon : \mathbb{N} \to (0,1)$ be an arbitrary mapping from the naturals to $(0,1)$. For any
constant degree $r$ there exists a constant $C = C(\mathbb{F}_p, r, \varepsilon(\cdot))$ such that the following holds: For
any degree-$r$ polynomial $f(x)$ there exists some $c \le C$, a set of polynomials $g_1(x), \dots, g_c(x)$
of degree at most $r$ and a function $F : \mathbb{F}_p^c \to \mathbb{F}_p$, such that:

1. $f(x) = F(g_1(x), \dots, g_c(x))$;

2. $\mathrm{dist}(\mathcal{D}(g_1, \dots, g_c), U_{\mathbb{F}_p^c}) \le \varepsilon(c)$.

Proof. We will choose $T : \mathbb{N} \to \mathbb{N}$ large enough, to be specified later, and apply Lemma 4.
Let $g_1, \dots, g_c$ be the polynomials given by the lemma such that $f(x) = F(g_1(x), \dots, g_c(x))$,
and the set $\{g_1, \dots, g_c\}$ is $T(c)$-regular. We will show that if we choose $T(\cdot)$ large enough,
we can guarantee that $\mathcal{D}(g_1, \dots, g_c)$ is close to uniform.

We first reduce the task to guaranteeing that all the non-zero linear combinations of
$g_1, \dots, g_c$ are close to uniform. We claim that in order to guarantee that
$\mathrm{dist}(\mathcal{D}(g_1, \dots, g_c), U_{\mathbb{F}_p^c}) \le \varepsilon(c)$, it is enough to guarantee for every non-zero linear
combination $g'(x) = a_1 g_1(x) + \cdots + a_c g_c(x)$ that $\mathrm{dist}(\mathcal{D}(g'), U_{\mathbb{F}_p}) \le p^{-c} \varepsilon(c)$. The proof is
by simple Fourier analysis: see for example Claim 33 in [3].

Given this reduction, we show it is enough to require that $g'$ is regular. Assume
$\mathrm{dist}(\mathcal{D}(g'), U_{\mathbb{F}_p}) > p^{-c} \varepsilon(c)$. Either $g' \equiv 0$, or, by Lemma 5, if $\mathrm{degree}(g') = k$ then

$$\mathrm{rank}_{k-1}(g') \le C_2(\mathbb{F}_p, k, p^{-c} \varepsilon(c)) \qquad (5)$$

In any case, if we set $T(c) = \max_{1 \le k \le r} C_2(\mathbb{F}_p, k, p^{-c} \varepsilon(c))$, we get that the set $\{g_1, \dots, g_c\}$
is not $T(c)$-regular, since $g'$ is not $T(c)$-regular. This is a contradiction to the promise of
Lemma 4.

Hence we conclude that the joint distribution $\mathcal{D}(g_1, \dots, g_c)$ has statistical distance of at
most $\varepsilon(c)$ to the uniform distribution over $\mathbb{F}_p^c$, where $c \le C$ and

$$C = C_1(\mathbb{F}_p, r, T(\cdot))$$

□

Before proving Lemma 3 we will also need the following simple claim: the statistical
distance between distributions bounds the probability that a function will be able to distinguish
between them:

Claim 7. Let $D', D''$ be two distributions taking values in the same set $S$. Then for any
subset $S' \subseteq S$:

$$\left| \Pr_{z \sim D'}[z \in S'] - \Pr_{z \sim D''}[z \in S'] \right| \le \mathrm{dist}(D', D'')$$
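
The claim can be checked exhaustively on a small example by comparing the largest gap over all events $S'$ with the statistical distance. This is a sketch, assuming statistical distance is half the $L_1$ distance; the function names and the two example distributions are hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import combinations


def statistical_distance(d1, d2):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(d1) | set(d2)
    return sum(abs(d1.get(s, 0) - d2.get(s, 0)) for s in support) / 2


def max_event_gap(d1, d2):
    """Largest |Pr_{D'}[S'] - Pr_{D''}[S']| over all subsets S' of the support."""
    support = sorted(set(d1) | set(d2))
    best = Fraction(0)
    for size in range(len(support) + 1):
        for subset in combinations(support, size):
            gap = abs(sum(d1.get(s, 0) - d2.get(s, 0) for s in subset))
            best = max(best, gap)
    return best


if __name__ == "__main__":
    d1 = {0: Fraction(5, 9), 1: Fraction(2, 9), 2: Fraction(2, 9)}
    d2 = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
    print(max_event_gap(d1, d2), statistical_distance(d1, d2))
```

For these two distributions the bound is attained by the event $S' = \{0\}$, so the inequality cannot be improved in general.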

We are now ready to prove Lemma 3.

Proof of Lemma 3. Let $f(x)$ be a degree-$r$ polynomial. Apply Lemma 6. There exists some
$C = C(\mathbb{F}_p, r, \varepsilon(\cdot))$ such that there is $c \le C$, a set of polynomials $g_1(x), \dots, g_c(x)$ and a
function $F : \mathbb{F}_p^c \to \mathbb{F}_p$ such that

1. $f(x) = F(g_1(x), \dots, g_c(x))$,

2. $\mathrm{dist}(\mathcal{D}(g_1, \dots, g_c), U_{\mathbb{F}_p^c}) \le \varepsilon(c)$.

We claim that the function $F(y_1, \dots, y_c)$, where $y_1, \dots, y_c \in \mathbb{F}_p$ are independent variables,
has approximately the same relative weight as that of $f(x) = F(g_1(x), \dots, g_c(x))$. We
bound, using Claim 7 for the inequality:

$$\begin{aligned}
|\text{rel-wt}(f) - \text{rel-wt}(F)|
&= \left| \Pr_{x \in \mathbb{F}_p^m}[F(g_1(x), \dots, g_c(x)) \neq 0] - \Pr_{y_1, \dots, y_c \in \mathbb{F}_p}[F(y_1, \dots, y_c) \neq 0] \right| \\
&= \left| \Pr_{x \in \mathbb{F}_p^m}[(g_1(x), \dots, g_c(x)) \in F^{-1}(\mathbb{F}_p \setminus \{0\})] - \Pr_{y_1, \dots, y_c \in \mathbb{F}_p}[(y_1, \dots, y_c) \in F^{-1}(\mathbb{F}_p \setminus \{0\})] \right| \\
&\le \mathrm{dist}(\mathcal{D}(g_1, \dots, g_c), \mathcal{D}(y_1, \dots, y_c)) \\
&= \mathrm{dist}(\mathcal{D}(g_1, \dots, g_c), U_{\mathbb{F}_p^c}) \le \varepsilon(c).
\end{aligned}$$

□



4 Open problems 

We studied in this work the density of the weights of $\mathrm{RM}_p(r,m)$ where we keep $r$ constant.
We proved that any $\alpha \in [0,1]$ which is not $p$-rational cannot be the limit of relative weights
of constant degree polynomials. However, we can ask what are the asymptotics of the degrees
of polynomials that are required to approximate $\alpha$, i.e., for every $\varepsilon > 0$, what should be
the degree of $f(x)$ such that $|\text{rel-wt}(f) - \alpha| < \varepsilon$, and how does this degree depend on $\varepsilon$?

Another open problem is giving good bounds on the constant $C$ in Lemma 3. We note
that the current proof depends on Lemma 4 and Lemma 5, for which no good bounds are
currently known.